home *** CD-ROM | disk | FTP | other *** search
- Path: solon.com!not-for-mail
- From: thads@csn.net (Thad Smith)
- Newsgroups: comp.lang.c,comp.lang.c.moderated
- Subject: Re: Floating Point arithmetic problem
- Date: 15 Feb 1996 07:57:20 -0600
- Organization: T3 Systems
- Sender: clc@solutions.solon.com
- Approved: clc@solutions.solon.com
- Message-ID: <4fve40$dli@solutions.solon.com>
- References: <c968da6jzm.fsf@damayanti.india.ti.com>
- Reply-To: ThadSmith@acm.org
- NNTP-Posting-Host: solutions.solon.com
-
- In article <c968da6jzm.fsf@damayanti.india.ti.com>,
- kuntal@india.ti.com (Kuntal Shah) wrote:
- >
- >This addition usually results in a few insignificant digits getting
- >accumulated in my double variable. For eg.
- >
- > f = 12.99 is represented as 12.9900000000000002131628207
- > f = 11.111 is represented as 11.1110000000000006536993169
- >
- >and so on. I am interested in comparisons with 4 digits precision.
- >
- >Now coming to the problem. The insignificant digits due to the
- >floating point representation keep accruing and there comes a stage
- >when the accrued value exceeds 0.0001 which results in failure of the
- >if condition in the above block of code, when ideally no such thing
- >should have occurred.
-
- You are running into a fundamental problem with floating point
- numbers. Assume that your values have a magnitude of 1e4 (which you
- say is the limit). Assume further that 8-byte IEEE-754 arithmetic is
- used. With approximately 17 digits of precision, the lsb is
- approximately 1e-13. If you add 1e6 numbers and the contributed
- rounding error from each addition is 1/2 lsb (a hand-waving, but
- probably reasonable sort of assumption), the accumulated error would
- be on the order of 5e-8. That is better than your tolerance of 1e-4.
-
- The retained precision also depends on the order in which the numbers
- are added. The above analysis assumes that the accumulated value
- stays less than 1e4. If, instead, you added 5e5 positive values of
- approximately 1e4, THEN 5e5 negative values of opposite value which
- result in a theoretical sum near zero, the accumulated value will
- temporarily go to approximately 5e9, where the typical 1/2 lsb error
- would then be about 5e-9. Multiply that by 1e6 and get 5e-3, which is
- greater than you tolerance.
-
- >All I need is a solution that will overcome this problem. Please
- bear
- >in mind that the loop is executed millions of times and hence any
- >costly operation within the loop with drastically bring down
- >performance.
-
- Avoid accumulating large values which will eventually cancel. See if
- your implementation supports a long double type with greater
- precision than double.
-
- >I have a few options to overcome this problem :-
- >
- >* After each addition, covert 'd' to an unsigned long after
- > multiplying by say 1e8, (thus truncating the unnecessary digits),
- > and divide it by 1e8 to get back the original value.
-
- That would, for arbitrary operands, make matters much worse by adding
- additional noise. Your examples, though, suggest that all your values
- might be multiples of .001 or .0001. If that is the case, you would
- be much better off scaling your values by a multiplier that results in
- all values being integers, i.e., multiplying by 10000, before summing
- them. This will reduce the noise produced when adding the numbers,
- probably enough to give you a perfect solution. Obviously you need to
- then divide the final sum by the scaling value to get your intended
- sum.
-
- Thad
-